In this Rmarkdown document we are going to use MAGIC to better visualize genes and ease with the annotation when using specific marker genes. MAGIC was developed by Smita Krishnaswamy’s lab to try to fill in the drop out reads in the spots. MAGIC is a Markov Affinity-based Graph Imputation of Cells used for denoising high-dimensional data most commonly applied to single-cell RNA sequencing data. MAGIC learns the manifold data, using the resultant graph to smooth the features and restore the structure of the data based on their k-nearest neighbors.
library(Seurat)
library(dplyr)
library(ggplot2)
Loading Rmagic
library(reticulate)
# conda create -n MAGIC python=3.7
# conda install -c bioconda scprep
# conda install matplotlib future tasklogger graphtools scipy pandas Deprecated pygsp pytz python-dateutil six threadpoolctl joblib decorator wrapt cycler kiwisolver pyparsing pillow
# conda install -c anaconda zlib
# ~/anaconda3/envs/MAGIC/bin/pip3 install magic-impute
#
# path_to_python <- "/media/data1/anaconda3/envs/MAGIC"
path_to_python <- "/scratch/groups/hheyn/software/anaconda3/envs/spatial_r/"
# reticulate::use_python(path_to_python, required = TRUE)
reticulate::use_condaenv(path_to_python)
reticulate::py_discover_config(required_module = "magic")
## python: /scratch/groups/hheyn/software/anaconda3/envs/spatial_r/bin/python
## libpython: /scratch/groups/hheyn/software/anaconda3/envs/spatial_r/lib/libpython3.9.so
## pythonhome: /scratch/groups/hheyn/software/anaconda3/envs/spatial_r:/scratch/groups/hheyn/software/anaconda3/envs/spatial_r
## version: 3.9.4 | packaged by conda-forge | (default, May 10 2021, 22:13:33) [GCC 9.3.0]
## numpy: /scratch/groups/hheyn/software/anaconda3/envs/spatial_r/lib/python3.9/site-packages/numpy
## numpy_version: 1.20.3
## magic: /home/devel/melosua/.local/lib/python3.9/site-packages/magic
##
## python versions found:
## /scratch/groups/hheyn/software/anaconda3/envs/spatial_r/bin/python
## /home/devel/melosua/anaconda3/envs/r-reticulate/bin/python
## /scratch/groups/hheyn/software/anaconda3/envs/spatial_r/bin/python3
## /usr/bin/python
## /home/devel/melosua/anaconda3/bin/python
## /home/devel/melosua/anaconda3/envs/MAGIC/bin/python
## /home/devel/melosua/anaconda3/envs/scrublet/bin/python
## /home/devel/melosua/anaconda3/envs/sLDA_tutorial/bin/python
## /home/devel/melosua/anaconda3/envs/spaceranger/bin/python
## /home/devel/melosua/anaconda3/envs/spatial_transcriptomics/bin/python
reticulate::py_config()
## python: /scratch/groups/hheyn/software/anaconda3/envs/spatial_r/bin/python
## libpython: /scratch/groups/hheyn/software/anaconda3/envs/spatial_r/lib/libpython3.9.so
## pythonhome: /scratch/groups/hheyn/software/anaconda3/envs/spatial_r:/scratch/groups/hheyn/software/anaconda3/envs/spatial_r
## version: 3.9.4 | packaged by conda-forge | (default, May 10 2021, 22:13:33) [GCC 9.3.0]
## numpy: /scratch/groups/hheyn/software/anaconda3/envs/spatial_r/lib/python3.9/site-packages/numpy
## numpy_version: 1.20.3
##
## python versions found:
## /scratch/groups/hheyn/software/anaconda3/envs/spatial_r/bin/python
## /home/devel/melosua/anaconda3/envs/r-reticulate/bin/python
## /scratch/groups/hheyn/software/anaconda3/envs/spatial_r/bin/python3
## /usr/bin/python
## /home/devel/melosua/anaconda3/bin/python
## /home/devel/melosua/anaconda3/envs/MAGIC/bin/python
## /home/devel/melosua/anaconda3/envs/scrublet/bin/python
## /home/devel/melosua/anaconda3/envs/sLDA_tutorial/bin/python
## /home/devel/melosua/anaconda3/envs/spaceranger/bin/python
## /home/devel/melosua/anaconda3/envs/spatial_transcriptomics/bin/python
library(Rmagic)
set.seed(123)
source(here::here("misc/paths.R"))
source(here::here("utils/bin.R"))
"{plasma}/{plt_dir}" %>%
glue::glue() %>%
here::here() %>%
dir.create(path = .,
showWarnings = FALSE,
recursive = TRUE)
"{plasma}/{robj_dir}" %>%
glue::glue() %>%
here::here() %>%
dir.create(path = .,
showWarnings = FALSE,
recursive = TRUE)
The data used in this Rmarkdown document comes from 03-clustering_integration.Rmd where the data was integrated.
merged_se <- "{anot}/{robj_dir}/integrated_spatial_annot.rds" %>%
glue::glue() %>%
here::here() %>%
readRDS(file = .)
We are going to use manually selected marker by the annotation team to try to spot specific Plasma cell related populations
We’re also going to use the following markers from these papers:
A Spatially Resolved Dark- versus Light-Zone Microenvironment Signature Subdivides Germinal Center-Related Aggressive B Cell Lymphomas - Dark Zone: OAZ1, AICDA, H3, MKI67, POLH - Light Zone: LAG3, ITGB8, PDCD1, TIGIT, BCL2, PECAM1, LY6E, B7-H3 (CD276), HLA-DRB1, PSMB10, TNF, ARG1, HLA-E, STAT1
"{plasma}/gene_dict_plasma.R" %>%
glue::glue() %>%
here::here() %>%
source(file = .)
plasma_vec
## [1] "SUGCT" "AICDA" "CXCR4" "LMO2" "CD83" "BCL2A1" "BCL6" "IRF8" "MEF2B" "MS4A1" "PAX5" "PRDM1" "XBP1" "IRF4" "SLAMF7" "SSR4" "MZB1" "DERL3" "CREB3L2" "IGHG1" "IGHG2" "IGHG3" "IGHG4" "IGHA1" "IGHA2" "IGHM" "IGHD" "IGHV3-20" "IGHV3-43" "CD9" "CD44" "BANK1" "CELF2" "TXNIP" "MKI67" "TOP2A" "TUBA1B"
# "{cd4}/gene_dict.R" %>%
# glue::glue() %>%
# here::here() %>%
# source(file = .)
# gene_vec
# gene_vec <- unique(c(plasma_vec, gene_vec))
gene_vec <- intersect(plasma_vec, rownames(merged_se))
We are going to remove some spots that may be problematic when carrying out MAGIC. Since we are ultimately diffusing the gene expression specific spots with very high expression may be driving the MAGIC expression a lot and we want to avoid it. As we see below there are 1 or 2 spots per slice with an abnormal expression of that genes so we are going to remove them.
Seurat::SpatialFeaturePlot(object = merged_se,
features = c(plasma_genes[["DZ"]], "TOP2A"),
alpha = c(0, 1),
ncol = 5,
images = "esvq52_nluss5")
Look at this for all the tissue slices
lapply(unique(merged_se$gem_id), function(i){
gene_plt <- Seurat::SpatialFeaturePlot(
object = merged_se,
features = gene_vec,
alpha = c(0, 1),
ncol = 5,
images = i)
"{plasma}/{plt_dir}/plasma_markers_{i}.pdf" %>%
glue::glue() %>%
here::here() %>%
cowplot::save_plot(
filename = .,
plot = gene_plt,
base_height = 35,
base_width = 25)
})
Run MAGIC, it is recommended by the developers to run it within sample so we are going to run it separately for each one. This follows the same principle as why we want to run SCTransform in samples individually, we don’t want information leaking from one dataset to another even if they are healthy tonsils that should be homogeneous.
se <- merged_se
magic_ls <- lapply(id_sp_df$gem_id, function(id) {
print(id)
sub_se <- se[, se$gem_id == id]
# Remove 0 genes
sum_vec <- sparseMatrixStats::rowSums2(sub_se@assays$Spatial@counts[gene_vec, ])
gene_tmp <- gene_vec[sum_vec > 0]
data_magic <- Rmagic::magic(
data = sub_se,
assay = "Spatial",
counts = "data",
genes = gene_tmp,
knn = 2,
knn.max = NULL,
decay = 1,
# Set t = 2 for minimal diffusion
t = 5,
npca = 100,
init = NULL,
t.max = 20,
knn.dist.method = "euclidean",
verbose = 1,
n.jobs = 1,
seed = 123)
tmp_mtrx <- data_magic@assays$MAGIC_Spatial@data
data.frame(tmp_mtrx, check.names = FALSE)
})
## [1] "tarwe1_xott6q"
## [1] "Added MAGIC output to MAGIC_Spatial. To use it, pass assay='MAGIC_Spatial' to downstream methods or set seurat_object@active.assay <- 'MAGIC_Spatial'."
## [1] "c28w2r_7jne4i"
## [1] "Added MAGIC output to MAGIC_Spatial. To use it, pass assay='MAGIC_Spatial' to downstream methods or set seurat_object@active.assay <- 'MAGIC_Spatial'."
## [1] "esvq52_nluss5"
## [1] "Added MAGIC output to MAGIC_Spatial. To use it, pass assay='MAGIC_Spatial' to downstream methods or set seurat_object@active.assay <- 'MAGIC_Spatial'."
## [1] "p7hv1g_tjgmyj"
## [1] "Added MAGIC output to MAGIC_Spatial. To use it, pass assay='MAGIC_Spatial' to downstream methods or set seurat_object@active.assay <- 'MAGIC_Spatial'."
## [1] "gcyl7c_cec61b"
## [1] "Added MAGIC output to MAGIC_Spatial. To use it, pass assay='MAGIC_Spatial' to downstream methods or set seurat_object@active.assay <- 'MAGIC_Spatial'."
## [1] "zrt7gl_lhyyar"
## [1] "Added MAGIC output to MAGIC_Spatial. To use it, pass assay='MAGIC_Spatial' to downstream methods or set seurat_object@active.assay <- 'MAGIC_Spatial'."
## [1] "qvwc8t_2vsr67"
## [1] "Added MAGIC output to MAGIC_Spatial. To use it, pass assay='MAGIC_Spatial' to downstream methods or set seurat_object@active.assay <- 'MAGIC_Spatial'."
## [1] "exvyh1_66caqq"
## [1] "Added MAGIC output to MAGIC_Spatial. To use it, pass assay='MAGIC_Spatial' to downstream methods or set seurat_object@active.assay <- 'MAGIC_Spatial'."
# Combine all the matrices
# magic_df <- lapply(magic_ls, function(i) {
# i <- t(i)
# if (ncol(i) < length(gene_vec)) {
# # Add 0 to those genes not present in the slide
# g <- gene_vec[!gene_vec %in% colnames(i)]
# i[, g] <- 0
# tmp <- data.frame(i, check.names = FALSE)
# } else {
# tmp <- data.frame(i, check.names = FALSE)
# }
# tmp
# } ) %>%
# dplyr::bind_cols()
# https://stackoverflow.com/questions/14783606/merge-multiple-data-frame-by-row-in-r
magic_df <- Reduce(function(a,b){
ans <- merge(a, b, by = "row.names", all = TRUE)
row.names(ans) <- ans[, "Row.names"]
ans[, !names(ans) %in% "Row.names"]
}, magic_ls)
# Replace all NA for 0
magic_df[is.na(magic_df)] <- 0
"{plasma}/{robj_dir}/MAGIC-mtrx.rds" %>%
glue::glue() %>%
here::here() %>%
saveRDS(object = magic_df, file = .)
sessionInfo()
## R version 4.0.1 (2020-06-06)
## Platform: x86_64-conda_cos6-linux-gnu (64-bit)
## Running under: Red Hat Enterprise Linux Server release 6.7 (Santiago)
##
## Matrix products: default
## BLAS/LAPACK: /scratch/groups/hheyn/software/anaconda3/envs/spatial_r/lib/libopenblasp-r0.3.12.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=es_ES.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=es_ES.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=es_ES.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] Rmagic_2.0.3 Matrix_1.4-0 reticulate_1.22 ggplot2_3.3.5 dplyr_1.0.7 SeuratObject_4.0.4 Seurat_4.0.5
##
## loaded via a namespace (and not attached):
## [1] Rtsne_0.15 colorspace_2.0-2 deldir_1.0-6 ellipsis_0.3.2 ggridges_0.5.3 rprojroot_2.0.2 spatstat.data_2.1-0 farver_2.1.0 leiden_0.3.9 listenv_0.8.0 bit64_4.0.5 ggrepel_0.9.1 fansi_0.5.0 sparseMatrixStats_1.2.1 codetools_0.2-18 splines_4.0.1 knitr_1.36 polyclip_1.10-0 jsonlite_1.7.2 ica_1.0-2 cluster_2.1.2 png_0.1-7 uwot_0.1.11 shiny_1.7.1 sctransform_0.3.2 spatstat.sparse_2.0-0 readr_2.1.1 compiler_4.0.1 httr_1.4.2 assertthat_0.2.1 fastmap_1.1.0 lazyeval_0.2.2 cli_3.1.0 later_1.3.0 htmltools_0.5.2 tools_4.0.1 igraph_1.2.10 gtable_0.3.0 glue_1.5.1 RANN_2.6.1 reshape2_1.4.4 rappdirs_0.3.3 Rcpp_1.0.7 scattermore_0.7 jquerylib_0.1.4 vctrs_0.3.8 nlme_3.1-153 lmtest_0.9-39 xfun_0.29
## [50] stringr_1.4.0 globals_0.14.0 mime_0.12 miniUI_0.1.1.1 lifecycle_1.0.1 irlba_2.3.5 goftest_1.2-3 future_1.23.0 MASS_7.3-54 zoo_1.8-9 scales_1.1.1 vroom_1.5.7 spatstat.core_2.3-2 MatrixGenerics_1.2.1 hms_1.1.1 promises_1.2.0.1 spatstat.utils_2.3-0 parallel_4.0.1 RColorBrewer_1.1-2 yaml_2.2.1 pbapply_1.5-0 gridExtra_2.3 sass_0.4.0 rpart_4.1-15 stringi_1.7.6 highr_0.9 rlang_0.4.12 pkgconfig_2.0.3 matrixStats_0.61.0 evaluate_0.14 lattice_0.20-45 ROCR_1.0-11 purrr_0.3.4 tensor_1.5 labeling_0.4.2 patchwork_1.1.1 htmlwidgets_1.5.4 bit_4.0.4 cowplot_1.1.1 tidyselect_1.1.1 here_1.0.1 parallelly_1.29.0 RcppAnnoy_0.0.19 plyr_1.8.6 magrittr_2.0.1 R6_2.5.1 generics_0.1.1 DBI_1.1.1 pillar_1.6.4
## [99] withr_2.4.3 mgcv_1.8-38 fitdistrplus_1.1-6 survival_3.2-13 abind_1.4-5 tibble_3.1.6 future.apply_1.8.1 crayon_1.4.2 KernSmooth_2.23-20 utf8_1.2.2 spatstat.geom_2.3-1 plotly_4.10.0 tzdb_0.2.0 rmarkdown_2.11 grid_4.0.1 data.table_1.14.2 digest_0.6.29 xtable_1.8-4 tidyr_1.1.4 httpuv_1.6.4 munsell_0.5.0 viridisLite_0.4.0 bslib_0.3.1